Polls and data analysis have always been widely used during electoral campaigns, and the recent spread of the internet has given access to large amounts of new data. In this scenario, many studies have shown the high predictive capacity of Google Trends as a forecasting tool in the social, economic and health fields.
Inspired by Prado-Román’s study (Google Trends as a Predictor of Presidential Elections: the United States versus Canada), this project tests the hypothesis that the Google Trends tool has a high predictive capacity in anticipating the winner of an election. The aim of this small study is to demonstrate the ability of Google Trends to predict the winner of the Italian parliamentary election of 2022.
To get the data I needed for my analysis, I used the Google Trends API provided by Google, which returns search-volume data for single search terms or comparisons over a selected time period. The results are returned as a standardized measure: Google assigns each search term a popularity score scaled between 0 and 100.
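As a rough illustration of what a 0 to 100 scale means, the following minimal Python sketch (not the R code used in this project) rescales a series so that its peak becomes 100. The function name `scale_to_100` and the raw volumes are invented for the example. It also assumes only the final rescaling step: Google additionally samples the data and normalizes by total search volume, which is not reproduced here.

```python
def scale_to_100(raw_volumes):
    """Rescale a series so that its largest value becomes 100."""
    peak = max(raw_volumes)
    if peak == 0:
        # A series with no searches at all stays at zero.
        return [0 for _ in raw_volumes]
    return [round(100 * v / peak) for v in raw_volumes]


# Invented raw volumes, for illustration only.
raw = [120, 300, 60, 0]
scaled = scale_to_100(raw)
print(scaled)  # [40, 100, 20, 0]
```

The point in time with the highest volume gets the value 100, and every other point is expressed as a share of that peak.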
To interact with the Google Trends API I used the gtrendsR
package (more
info), which retrieves data from Google Trends and returns them
as a dataset with information about interest over time (search
volume), interest by country, region or city, related topics and related
queries. For my purposes, I only used the data about
interest over time, which contain the
search volume for the single search terms I am interested in.
In this project I decided to analyze the data between July 21st 2022, the day President Mattarella announced new elections, and September 25th 2022, the “election day”: this period covers the months of the electoral campaign.
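The window can be written down explicitly. The minimal Python sketch below (not the R code used in this project) builds a "start end" date string and counts the days covered. The variable names are illustrative, and the "YYYY-MM-DD YYYY-MM-DD" layout is assumed to be the format that the `time` argument of gtrendsR accepts for a custom date range.

```python
from datetime import date

start = date(2022, 7, 21)  # new elections announced
end = date(2022, 9, 25)    # election day

# Assumed "start end" layout for a custom time span.
time_window = f"{start.isoformat()} {end.isoformat()}"

# Number of calendar days in the window, both endpoints included.
n_days = (end - start).days + 1

print(time_window)  # 2022-07-21 2022-09-25
print(n_days)       # 67
```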
For the analysis I converted the variable hits into a
numeric variable hit_score, recoding the value “< 1”
as zero.
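The recoding step can be sketched as follows. This is a minimal Python illustration, not the R code used in this project: the helper name `to_hit_score` and the sample values are invented, and the input is assumed to arrive as text or integers in which the low-volume marker may appear with or without a space.

```python
def to_hit_score(hits):
    """Convert a raw hits value to an integer, mapping "<1" to 0."""
    text = str(hits).strip().replace(" ", "")
    if text == "<1":
        # Volumes too small to be reported are treated as zero.
        return 0
    return int(text)


# Invented sample values, for illustration only.
raw_hits = ["45", "<1", "100", "< 1", 7]
hit_score = [to_hit_score(h) for h in raw_hits]
print(hit_score)  # [45, 0, 100, 0, 7]
```

Mapping “< 1” to zero slightly understates very small volumes, but it keeps the whole variable numeric, so it can be averaged and plotted.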
First, I chose to explore the single search terms for each
relevant political actor of the 2022 election: I compared the
surname and name + surname of the party
leader, the party name and sometimes the party’s
acronym.
For Matteo Salvini I also included the old party name “Lega Nord”, which is sometimes searched more often than the current name “Lega per Salvini premier”.
For Enrico Letta and Giuseppe Conte I also included the party acronyms, which are the forms most used in everyday language and by the media. Moreover, I decided to analyze Matteo Renzi and Carlo Calenda together, since they ran together in the elections; I also included the term “terzo polo”, an informal label used by the media for their coalition.
It is interesting to note that the surname is usually the most searched term for each actor, except for Enrico Letta: the acronym “pd” was searched more than “Letta”. We can hypothesize that this is due to a lower degree of personalization of his party, compared with the populist parties in the competition.
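One simple way to decide which term is "the most searched" for an actor is to compare the average score of each term over the period. The Python sketch below is only an illustration under stated assumptions: the text does not say which statistic was used, so the mean is an assumption, and the function name, the term labels and the numbers are invented placeholders. It also assumes the terms were retrieved in the same comparison, so that their scores share one scale.

```python
from statistics import mean


def most_searched(series_by_term):
    """Return the term whose scores have the highest mean."""
    return max(series_by_term, key=lambda term: mean(series_by_term[term]))


# Invented placeholder series, for illustration only (not project data).
example = {
    "surname": [40, 55, 70],
    "name surname": [10, 12, 20],
    "party acronym": [30, 35, 50],
}
top_term = most_searched(example)
print(top_term)  # surname
```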